Abstract: This paper focuses on the design of
multiplier-less decimation filters suitable for oversampled
digital signals. The aim is twofold. On the one hand, it
proposes an optimization framework for the design of the
constituent decimation filters in a general multistage
decimation architecture. The basic building blocks embedded
in the proposed filters belong, for a simple reason, to the
class of cyclotomic polynomials (CPs): the first 104 CPs
have a z-transfer function whose coefficients are simply
{-1, 0, +1}. On the other hand, the paper provides a
set of useful techniques, most of which stem from
some key properties of CPs, for designing the proposed
filters in a variety of architectures. Both recursive and
non-recursive architectures are discussed by focusing on
a specific decimation filter obtained as a result of the
optimization algorithm.

Design guidelines are provided with the aim to simplify 
the design of the constituent decimation filters in the 
multistage chain. 

Index Terms: A/D converter, CIC, cyclotomic, comb,
decimation, decimation filter, multistage, polynomial,
sigma-delta, sinc filters.



I. Introduction and Problem Formulation 

The design of multistage decimation filters for oversampled
signals is a well-known research topic.
Mainly inspired by the need for computationally efficient
architectures for wide-band, multi-standard, reconfigurable
receiver design, this research topic has recently garnered
new emphasis in the scientific community [1]-[5].
Multistage decimation filters are also employed for
decimating highly oversampled signals from noise-shaping
ΣΔ A/D converters [6].

Given a base-band analog input signal $x(t)$ with
bandwidth $[-B_x, +B_x]$, an A/D converter produces a
digital signal $x(nT_0)$ by sampling $x(t)$ at rate
$f_0 = 1/T_0 = 2\rho B_x \ge 2B_x$, whereby $\rho \ge 1$ is the
oversampling ratio (notice that $\rho \gg 1$ for oversampled signals).
The normalized maximum frequency contained in the
input signal is defined as $f_c^0 = B_x/f_0 = 1/(2\rho)$, and the
digital signal $x(nT_0)$ at the input of the first decimation
filter has frequency components belonging to the range
$[-f_c^0, +f_c^0]$. This setup is pictorially depicted in the
reference architecture shown in Fig. 1.

The author is with the Dipartimento di Elettronica, Politecnico
di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy.
E-mail: laddomada@polito.it

Owing to the condition $\rho \gg 1$, the decimation
of an oversampled signal $x(nT_0)$ is efficiently [1]
accomplished by cascading two (or more) decimation
stages as highlighted in Fig. 1, in which a multistage
architecture composed of $m$ decimation stages is
shown as reference scheme. Consider an oversampling
ratio $\rho$ which can be factorized as follows:

$$\rho = \prod_{i=1}^{m} D_i$$

whereby, for any $i$, $D_i$ is an appropriate integer strictly
greater than zero.

In the general architecture shown in Fig. 1, the sampling
rate decreases in $m$ consecutive stages, whereby the
sampling rate at the input of the $i$th stage is

$$f_{i-1} = f_i \cdot D_i, \quad \forall i = 1, \ldots, m$$

while the output sample data rate is:

$$f_m = \frac{f_0}{\prod_{i=1}^{m} D_i} = \frac{f_0}{\rho}$$

The design of any decimation stage in a multistage
architecture imposes stringent constraints on the shape
of the frequency response over the so-called folding
bands. Considering the scheme in Fig. 1, the frequency
response $H_i(e^{j\omega})$ of the $i$th decimation filter must
attenuate the quantization noise (QN) falling inside the
frequency ranges defined as

$$\left[\frac{k}{D_i} - f_c^{i-1},\; \frac{k}{D_i} + f_c^{i-1}\right], \quad k = 1, \ldots, k_M, \qquad k_M = \left\lfloor \frac{D_i}{2} \right\rfloor = \begin{cases} D_i/2, & D_i \text{ even} \\ (D_i - 1)/2, & D_i \text{ odd} \end{cases} \tag{1}$$

whereby $f_c^{i-1}$ is the normalized signal bandwidth at
the input of the $i$th decimation filter. The reason is
simple: the QN falling inside these frequency bands
will fold down to baseband (i.e., inside the useful
signal bandwidth $[-f_c^{i-1}, +f_c^{i-1}]$) because of the
sampling rate reduction by $D_i$ in the $i$th decimation
stage, irremediably affecting the signal resolution after
the multistage decimation chain.
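As a rough illustration of (1), the folding bands can be listed numerically. The sketch below assumes $k_M = \lfloor D_i/2 \rfloor$ and uses hypothetical values ($\rho = 32$, $D = 8$); the function name `folding_bands` is introduced here only for illustration.

```python
from fractions import Fraction


def folding_bands(D, fc):
    """Return the bands [k/D - fc, k/D + fc] for k = 1, ..., floor(D/2)."""
    k_max = D // 2
    return [(Fraction(k, D) - fc, Fraction(k, D) + fc)
            for k in range(1, k_max + 1)]


if __name__ == "__main__":
    rho = 32                    # hypothetical oversampling ratio
    D = 8                       # hypothetical first-stage decimation factor
    fc = Fraction(1, 2 * rho)   # normalized bandwidth 1/(2*rho)
    for lo, hi in folding_bands(D, fc):
        print(f"[{float(lo):.6f}, {float(hi):.6f}]")
```

Exact fractions are used so that band edges can later be compared with zero locations of the form $i/q$ without rounding issues.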



Fig. 1. General architecture of an m-stage decimation chain for A/D converters ($\rho = D_1 D_2 \cdots D_m$), along with a pictorial representation of the key frequency
intervals to be carefully considered for the design of the $i$th decimation stage. The sampling rate at the input of the $i$th decimation stage
is $f_{i-1}$, $\forall i = 1, \ldots, m$.



On the other hand, frequency ranges labelled as don't
care bands in Fig. 1 do not require a stringent selectivity
since the QN within these bands will be rejected
by the subsequent filters in the multistage chain.
The relation between $f_c^i$ and $f_c^0$ is as follows:

$$f_c^i = f_c^0 \prod_{j=1}^{i} D_j, \quad \forall i = 1, \ldots, m$$

whereby it is $f_c^0 = 1/(2\rho)$.

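The $i$th decimation filter $H_i(e^{j\omega})$ introduces a passband
ripple $\delta_p$ which can also be expressed in dB as
follows

$$R_p = -20\log_{10}\left(\frac{1 - \delta_p}{1 + \delta_p}\right) > 0 \tag{2}$$

while the selectivity (in dB) corresponds to

$$A_s = 20\log_{10}\left(\frac{\delta_s}{1 + \delta_p}\right) \approx 20\log_{10}(\delta_s) < 0 \tag{3}$$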

With this background, let us provide a quick survey of 
the recent literature related to the problem addressed 
here. This survey is by no means exhaustive and is 
meant to simply provide a sampling of the literature 
in this fertile area. 

Excellent tutorials on the design of multirate filters
can be found in [7], [8], while an essential book on
this topic is [1]. Recently, Coffey [9], [10] addressed
the design of optimized multistage decimation and
interpolation filters.

The design of cascade-integrator comb (CIC) filters
was first addressed in [11], while multirate architectures
embedding comb filters have been discussed
in [12]. Since then, many papers [13] have focused on
the computational optimization of CIC filters, even in
the light of new wide-band and reconfigurable receiver
design applications [14]-[16]. Comb filters have
then been generalized in [17]-[20], especially in relation to
the decimation of ΣΔ modulated signals.

Other works somewhat related to the topic addressed
in this paper are [21]-[27]. The use of decimation
sharpened filters embedding comb filters is addressed
in [20]-[21], while in [22] the authors proposed
computationally efficient decimation filter architectures using
polyphase decomposition of comb filters. Dolecek et
al. proposed a novel two-stage sharpened comb decimator
in [23]. The design of FIR filters using cyclotomic
polynomial (CP) prefilters has been addressed
in [24], while effective algorithms for the design of
low-complexity FIR filters embedding CP prefilters
have been proposed in [25]-[27].

Owing to the discussion on the folding bands
presented above, this paper addresses the design of
computationally efficient decimation filters suitable for
oversampled digital signals. Natural eligible blocks
used in filter design are cyclotomic polynomials with
index less than 105, since these polynomials possess
coefficients belonging to the set {-1, 0, +1}. We first
recall the basic properties of CPs in Section II, since
these properties suggest useful hints at the basis of the
practical implementation of the designed decimation
filters. For conciseness, we address the design of the
first stage in the multistage architecture, even though
the considerations which follow are easily applicable
to any other stage in the chain.

The computational complexity of basic CP filters is
discussed in Section III. In Section IV, we propose an
optimization framework whose main aim is to design
an optimal decimation filter (optimal in that the cost
function to be minimized accounts for the number of
additions required by the chosen CP filter) featuring
high selectivity within the folding bands seen from
the $i$th decimation stage.
TABLE I

Values of the totient function $\phi(n)$ for $n \in [1, 69]$. The first entry
is associated with $n = 1$, while the following entries are
associated with increasing values of $n$.

1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4, 12, 6, 8, 8, 16, 6, 18, 8, 12,
10, 22, 8, 20, 12, 18, 12, 28, 8, 30, 16, 20, 16, 24, 12, 36, 18,
24, 16, 40, 12, 42, 20, 24, 22, 46, 16, 42, 20, 32, 24, 52, 18,
40, 24, 36, 28, 58, 16, 60, 30, 36, 32, 48, 20, 66, 32, 44



TABLE II

Values of the Möbius function $\mu(n)$ for $n \in [1, 104]$.

$\mu(n) = -1$:  n = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 30, 31, 37, 41, 42,
43, 47, 53, 59, 61, 66, 67, 70, 71, 73, 78, 79, 83, 89,
97, 101, 102, 103

$\mu(n) = 1$:  n = 1, 6, 10, 14, 15, 21, 22, 26, 33, 34, 35, 38, 39, 46,
51, 55, 57, 58, 62, 65, 69, 74, 77, 82, 85, 86, 87,
91, 93, 94, 95

$\mu(n) = 0$:  n = 4, 8, 9, 12, 16, 18, 20, 24, 25, 27, 28, 32, 36, 40, 44,
45, 48, 49, 50, 52, 54, 56, 60, 63, 64, 68, 72, 75, 76,
80, 81, 84, 88, 90, 92, 96, 98, 99, 100, 104



The practical implementation of the designed decimation
filters is addressed in Section V, whereby both recursive
and non-recursive architectures, stemming from
a variety of properties of polynomials, are discussed.
Finally, Section VI draws the conclusions.

II. Basics of Cyclotomic Polynomials and 
Key Properties 

Cyclotomic polynomials (CPs) arose hand in hand
with the old Greek problem of dividing a circle into
equal parts. Key properties of such polynomials along
with the basic rationales can be found in various
number theory books (we invite the interested readers
to refer to [28], [29]), as well as in some recent
papers [24]. Given an integer $D$ strictly greater than
zero, polynomial $(1 - z^{-D})$ can be factorized as a
product of cyclotomic polynomials as follows:

$$1 - z^{-D} = \prod_{q:\, q|D} C_q(z) \tag{4}$$

whereby $q : q|D$ identifies the set of integers $q$, less
than or equal to $D$, which divide $D$ (in other words,
the remainder of the division of $D$ by $q$ is zero).
For each $q$ as above, there is a unique polynomial
$C_q(z)$ whose roots satisfy the following conditions.

• For each $q \le D$, the roots of $C_q(z)$ constitute a
subset of the roots belonging to the polynomial $1 - z^{-D}$.

• The roots of $C_q(z)$ are the primitive $q$th roots of
unity, i.e., they all fall on the z-plane unit circle.

• The number of roots corresponds to the number
of positive integers which are prime with respect
to $q$, and not greater than $q$.

• Roots of $C_q(z)$ do not belong to the set of roots
of the polynomial $1 - z^{-r}$, $\forall r : 0 < r < q \le D$.

Based on the observations above, polynomials $C_q(z)$
are defined as:

$$C_q(z) = \prod_{i:\, 1 \le i \le q,\ (i,q)=1} \left(1 - z^{-1} e^{j 2\pi \frac{i}{q}}\right) \tag{5}$$

whereby $(i, q) = 1$ is used to mean that $i$ and $q$ are
coprime [28]. Notice that, given an integer $q$, (5) allows
us to write the z-transfer function of any CP indexed
by $q$.
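The factorization (4) also gives a simple way to generate CP coefficients with integer arithmetic only: divide $1 - z^{-q}$ by the CPs of all proper divisors of $q$. The following is a minimal sketch of that idea; the helper names `poly_div_exact` and `cyclotomic` are illustrative and not taken from the paper.

```python
from functools import lru_cache


def poly_div_exact(num, den):
    """Divide integer polynomials in x = z^-1 (ascending powers, den[0] == 1)."""
    num = list(num)
    quot = []
    for k in range(len(num) - len(den) + 1):
        c = num[k]                       # den[0] == 1, so this is the quotient term
        quot.append(c)
        for j, dj in enumerate(den):
            num[k + j] -= c * dj
    assert not any(num), "division was not exact"
    return quot


@lru_cache(maxsize=None)
def cyclotomic(q):
    """Coefficients of C_q(z) in ascending powers of z^-1, obtained from (4)."""
    num = [1] + [0] * (q - 1) + [-1]     # 1 - z^-q
    for d in range(1, q):
        if q % d == 0:                   # proper divisors of q
            num = poly_div_exact(num, cyclotomic(d))
    return tuple(num)


if __name__ == "__main__":
    print(cyclotomic(15))
    print(all(set(cyclotomic(q)) <= {-1, 0, 1} for q in range(1, 105)))
```

With this convention `cyclotomic(1)` is $1 - z^{-1}$ and `cyclotomic(4)` is $1 + z^{-2}$.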

Key advantages of CPs in connection with filter design
rely on the following property: if $q$ has no more than
two distinct odd prime factors, polynomial $C_q(z)$
contains coefficients belonging to the set {-1, 0, +1}.
From a practical point of view, CP coefficients belong
to the set {-1, 0, +1} if $q \le 104$ [28], [29].

The degree of polynomial $C_q(z)$ is not $q$ but it is
defined as follows:

$$\deg\left[C_q(z)\right] = \phi(q) = q \sum_{d|q} \frac{\mu(d)}{d} \tag{6}$$

whereby $\phi(q)$ is the totient function (see Table I), i.e.,
the number of positive integers less than or equal to $q$ that
are relatively prime¹ to $q$, while $\mu(n)$ is the Möbius
function defined as:

$$\mu(n) = \begin{cases} 1, & n = 1 \\ (-1)^k, & n = p_1 \cdot p_2 \cdots p_k, \text{ with } p_i \text{ prime},\ p_i \ne p_j,\ \forall i \ne j \\ 0, & \text{if } n \text{ is divisible by the square of a prime} \end{cases} \tag{7}$$

Index $k$ in the second entry stands for the number of
distinct prime numbers which decompose the argument
$n$. Values of the Möbius function are shown in
Table II for $n \in [1, 104]$. Notice that $\mu(n) \ne 0$ implies
that $n$ is squarefree, i.e., its decomposition does not
contain repeated factors.
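Both arithmetic functions are easy to compute for the small indexes of interest here. The sketch below is one straightforward way to do so (trial division for $\mu$, a gcd count for $\phi$); it also checks numerically that $\phi(q) = \sum_{d|q} \mu(d)\, q/d$, which is the integer form of the degree formula. The function names are illustrative.

```python
from math import gcd


def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    k = 0                       # number of distinct prime factors found so far
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original argument
                return 0
            k += 1
        p += 1
    if n > 1:                   # one remaining prime factor
        k += 1
    return -1 if k % 2 else 1


def totient(n):
    """Number of integers in 1..n that are coprime to n."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def degree_from_mobius(q):
    """phi(q) written as the sum over divisors d of q of mu(d) * q / d."""
    return sum(mobius(d) * (q // d) for d in range(1, q + 1) if q % d == 0)


if __name__ == "__main__":
    print([totient(n) for n in range(1, 11)])
    print([n for n in range(1, 31) if mobius(n) == -1])
```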

The z-transfer function of a CP with squarefree
index $q$ is [30]:

$$C_q(z) = \sum_{d=0}^{\phi(q)} c_{q,d}\, z^{-(\phi(q) - d)} \tag{8}$$

whereby coefficients $c_{q,d}$ can be evaluated with the
following recursive relation:

$$c_{q,d} = -\frac{\mu(q)}{d} \sum_{p=0}^{d-1} c_{q,p}\, \mu\big(g(q, d-p)\big)\, \phi\big(g(q, d-p)\big) \tag{9}$$

¹Two numbers are said to be relatively prime if they do not contain
any common factor. Notice that the integer 1 is considered as being
relatively prime to any integer number.
using the initial value $c_{q,0} = 1$. Function $g(q, d-p)$
in (9) is the greatest common divisor of $q$ and
$d - p$. Notice that (9) represents an effective algorithm
for automatically generating the z-transfer function of
CPs with squarefree indexes $q$.
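The recursion can be sketched in a few lines. The version below assumes the form $c_{q,d} = -\frac{\mu(q)}{d}\sum_{p=0}^{d-1} c_{q,p}\,\mu(g)\,\phi(g)$ with $g = \gcd(q, d-p)$ and $c_{q,0} = 1$; helper names are illustrative, and the check against $C_{15}(z)$ and $C_{33}(z)$ is only a sanity test.

```python
from math import gcd


def mobius(n):
    """Moebius function by trial division."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            k += 1
        p += 1
    if n > 1:
        k += 1
    return -1 if k % 2 else 1


def totient(n):
    """Number of integers in 1..n that are coprime to n."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def cp_coefficients(q):
    """Coefficients c_{q,0..phi(q)} of a CP with squarefree index q >= 2."""
    mu_q = mobius(q)
    if mu_q == 0:
        raise ValueError("the recursion applies to squarefree indexes only")
    c = [1]                                   # initial value c_{q,0} = 1
    for d in range(1, totient(q) + 1):
        acc = 0
        for p in range(d):
            g = gcd(q, d - p)
            acc += c[p] * mobius(g) * totient(g)
        c.append(-mu_q * acc // d)            # the division is exact
    return c


if __name__ == "__main__":
    print(cp_coefficients(15))
    print(sum(1 for x in cp_coefficients(33) if x != 0))
```

For $q \ge 2$ the coefficient list is palindromic, so reading it in ascending or descending powers of $z^{-1}$ gives the same polynomial.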

Perhaps the main properties useful for deducing the
z-transfer function of any CP are the ones summarized
in the following [28]. We will discuss the application
of such properties in Section III, whereby the focus is
on the design of low complexity CPs in terms of both
additions and delays.

1) Given a prime number $p$, it is

$$C_p(z) = \sum_{i=0}^{p-1} z^{-i} = \frac{1 - z^{-p}}{1 - z^{-1}} \tag{10}$$

2) Let $k$, $n$ and $m$ be three positive integers. Then,
it is

$$C_{m n^k}(z) = C_{mn}\left(z^{n^{k-1}}\right) \tag{11}$$

3) Consider a prime number $p$, which does not
divide $q$, then

$$C_{pq}(z) = \frac{C_q(z^p)}{C_q(z)} \tag{12}$$

4) Given any odd integer $n$ greater than or equal to 3,
then it is

$$C_{2n}(z) = C_n(-z) \tag{13}$$

5) For $z = 1$, the following relation holds:

$$C_q(1) = \begin{cases} 0, & q = 1 \\ p, & q = p^k,\ p \text{ prime} \\ 1, & \text{otherwise} \end{cases} \tag{14}$$

This relation assures us that for indexes $q > 1$, the
z-transfer function of the respective CP presents
unity gain in baseband provided that $q \ne p^k$.
Otherwise, CP transfer functions have to be
normalized by $p$ in order to assure unity gain
in baseband.
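The five properties can be spot-checked on coefficient vectors. The sketch below generates CPs by exact division of $1 - z^{-q}$ (an implementation choice made here, not the paper's method) and tests one instance of each property; all helper names are illustrative.

```python
from functools import lru_cache


def poly_mul(a, b):
    """Product of two polynomials given in ascending powers of z^-1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_div_exact(num, den):
    """Exact division, ascending powers, den[0] == 1."""
    num, quot = list(num), []
    for k in range(len(num) - len(den) + 1):
        c = num[k]
        quot.append(c)
        for j, dj in enumerate(den):
            num[k + j] -= c * dj
    assert not any(num)
    return quot


@lru_cache(maxsize=None)
def cyclotomic(q):
    """Coefficients of C_q(z) in ascending powers of z^-1."""
    num = [1] + [0] * (q - 1) + [-1]
    for d in range(1, q):
        if q % d == 0:
            num = poly_div_exact(num, cyclotomic(d))
    return tuple(num)


def upsample(c, p):
    """Coefficients of C(z^p): insert p - 1 zeros between the taps of C(z)."""
    out = [0] * ((len(c) - 1) * p + 1)
    out[::p] = c
    return out


if __name__ == "__main__":
    print(cyclotomic(7) == (1,) * 7)                                    # (10)
    print(list(cyclotomic(60)) == upsample(cyclotomic(30), 2))          # (11)
    print(poly_mul(cyclotomic(33), cyclotomic(11))
          == upsample(cyclotomic(11), 3))                               # (12)
    print(cyclotomic(30)
          == tuple((-1) ** k * c for k, c in enumerate(cyclotomic(15))))  # (13)
    print([sum(cyclotomic(q)) for q in (1, 8, 9, 15)])                  # (14)
```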

III. Criteria for Identifying Low 
Complexity CPs 

The z-transfer function of CPs for any index $q$ can be
deduced upon employing the relation (5) along with
the properties stated in (10)-(13). Different architectures
(both recursive and non recursive) for implementing
each CP can be obtained, mainly differing
in the number of additions and delays required. For
conciseness, in this paper we show the z-transfer
functions of the first sixty CPs in Table IV; the
z-transfer functions of $C_q(z)$ for any $q \in \{1, \ldots, 104\}$
in both non recursive and recursive (if any) form can
be found in [31].

Let us discuss some key examples by starting from
CP $C_{33}(z)$. Considering that 33 is squarefree and given
that $q = 33$ can be written as $3 \times 11$, whereby 3 and 11
are coprime, there are three possible architectures for
implementing such a polynomial. The first one stems
from (8) and (9) and it consists of a non recursive
architecture (see Table IV) employing 14 additions
and 20 delays. On the other hand, two recursive
architectures follow upon using property (12) with
$p = 3$, $q = 11$ and $p = 11$, $q = 3$:

$$C_{3 \cdot 11}(z) = \frac{C_{11}(z^3)}{C_{11}(z)} = \frac{1 - z^{-33}}{1 - z^{-3}} \cdot \frac{1 - z^{-1}}{1 - z^{-11}} = \frac{1 - z^{-1} - z^{-33} + z^{-34}}{1 - z^{-3} - z^{-11} + z^{-14}}$$

$$C_{11 \cdot 3}(z) = \frac{C_3(z^{11})}{C_3(z)} = \frac{1 + z^{-11} + z^{-22}}{1 + z^{-1} + z^{-2}} \tag{15}$$

As far as the number of additions is concerned, from
(15) it easily follows that the architecture $C_{3 \cdot 11}(z)$
only requires 4 additions, which compares favorably
with both the non recursive implementation and
$C_{11 \cdot 3}(z)$. Notice also that, since CP coefficients are
simply {-1, 0, +1}, the recursive architectures can be
implemented without coefficient quantization; this in
turn suggests that exact pole-zero cancellation is not a
concern with these architectures.

On the other hand, the non recursive architecture
requires only 20 delays as opposed to the recursive
architectures requiring, respectively, 34 and 22 delays.
In this work, we suppose that the computational
complexity of the filter depends only on the number of
additions.
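The figures quoted for the non recursive form of $C_{33}(z)$ can be checked by expanding the polynomial. In the sketch below the addition count is taken as the number of nonzero taps minus one and the delay count as the polynomial degree; this counting convention is an assumption made for illustration, and the helper names are hypothetical.

```python
def poly_mul(a, b):
    """Product of two polynomials given in ascending powers of z^-1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_div_exact(num, den):
    """Exact division, ascending powers, den[0] == 1."""
    num, quot = list(num), []
    for k in range(len(num) - len(den) + 1):
        c = num[k]
        quot.append(c)
        for j, dj in enumerate(den):
            num[k + j] -= c * dj
    assert not any(num)
    return quot


def comb(n):
    """Coefficients of 1 - z^-n."""
    return [1] + [0] * (n - 1) + [-1]


# C_33(z) = C_3(z^11) / C_3(z) = (1 + z^-11 + z^-22) / (1 + z^-1 + z^-2)
c3_up = [1] + [0] * 10 + [1] + [0] * 10 + [1]
c33 = poly_div_exact(c3_up, [1, 1, 1])

additions = sum(1 for c in c33 if c != 0) - 1   # nonzero taps minus one
delays = len(c33) - 1                           # polynomial degree

# Cross-check of the other recursive form:
# C_33(z) (1 - z^-3)(1 - z^-11) = (1 - z^-33)(1 - z^-1)
lhs = poly_mul(poly_mul(c33, comb(3)), comb(11))
rhs = poly_mul(comb(33), comb(1))

if __name__ == "__main__":
    print(additions, delays, lhs == rhs)
```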

Upon comparing, for any $q$, both recursive and
non recursive architectures in Table IV (see also the
complete list of the first CPs reported in [31]), it
easily follows that recursive implementations, when they
exist, allow the reduction of the number of additions
with respect to non recursive implementations; the
price to pay, however, is the increased filter
delay. As a rule of thumb, non recursive architectures
should be preferred to recursive implementations when
memory space is a design constraint. On the other
hand, recursive architectures can greatly reduce the
number of additions.

Let us briefly discuss the possible architectures
related to an even indexed CP, such as $C_{60}(z)$. By
virtue of the different ways to factorize the integer 60,
property (12) can be applied with the following combinations:
$p = 5$, $q = 12$ and $p = 3$, $q = 20$, whereby in both
cases $p$ is a prime integer not dividing $q$. Property (11)
can be applied with $m = 15$, $n = 2$, $k = 2$. In Table IV
we show only the recursive and the non recursive
architectures yielding the lowest complexities.

When $q$ is a prime number, the z-transfer function
of the related CP corresponds to the first order comb
filter, as can be straightforwardly seen from (10).
Finally, property (13) can be effectively employed for
deducing the z-transfer function of CPs with even
indexes $q$ which can be written as $2n$, with $n$ an
odd number strictly greater than 2. As an example,
notice the following relation: $C_{30}(z) = C_{15}(-z)$.

The simple examples presented above are by no
means a complete picture of the capabilities and
sophistication that can be found in multistage structures
for sampling rate conversion. They are merely
intended to show why such structures can constitute the
starting point for obtaining computationally efficient
filters for decimating oversampled signals. The design
of computationally efficient decimation filters relies
on the combination of an appropriate set of CPs.
In oversampled A/D converters, for example, it is
very important to contain the computational burden
of the first stages in the multistage decimation chain.
This motivates the study of an effective algorithm for
identifying an appropriate set of CPs that, cascaded,
is able to attain a set of prescribed requirements as
specified in (2) and (3): this is the topic addressed in
the next section.

IV. Optimization Algorithm and Design 
Examples 

This section presents an optimization framework for
designing low complexity decimation filters, $H_i(z)$, as
a cascade of CP subfilters. For the derivations which
follow, consider the design of the $i$th decimation filter
in the multistage chain depicted in Fig. 1, with a
frequency response that can be represented as follows:

$$H_i(f_d) = \prod_{q=1}^{|S_{cp}|} C_q^{m_q}(f_d) \tag{16}$$

whereby $f_d$ is the digital frequency normalized with
respect to the sampling frequency as discussed
in Section I, $S_{cp}$ is a suitable set of eligible CPs to
be used in the optimization framework ($|S_{cp}|$ is the
cardinality of the set, i.e., the number of eligible CPs),
$C_q(f_d)$ is the frequency response of the CP indexed by
$q$ and $m_q$ is its integer order in the cascade constituting
$H_i(f_d)$ (it is $m_q \ge 0$, $\forall q$).

A suitable cost function accounting for the complexity
of the $i$th decimation filter can be defined as
a weighted combination of the number of adders and
delays required by the overall filter $H_i(z)$ [26]:

$$F\left(m_1, m_2, \ldots, m_{|S_{cp}|}\right) = \sum_{q=1}^{|S_{cp}|} m_q \cdot \left(N_{a,q} + \gamma \cdot N_{d,q}\right) \tag{17}$$

whereby $N_{a,q}$ and $N_{d,q}$ are, respectively, the number
of adders and delays of CP $C_q(z)$, and $\gamma \in [0, 1]$
is a factor depending on the relative complexity of
the delays with respect to the adders. In our setup,
we assume that the computational complexity of the
$i$th decimation filter is mainly due to the number of
adders; therefore, we set $\gamma = 0$. Notice that the cost
function depends on the CP orders $m_1, \ldots, m_{|S_{cp}|}$,
while $N_{a,q}$ and $N_{d,q}$ are known once the set $S_{cp}$ of
eligible CPs has been appropriately identified. Notice
also that $N_{a,q}$ and $N_{d,q}$ can be straightforwardly
obtained from Table IV (see also [31] for a list of all
104 CPs).

Let us address the choice of the eligible CPs in the
set $S_{cp}$. This is one of the most important design steps
since the complexity of the optimization framework
discussed below is tightly tied to the number of
eligible CPs. By virtue of the discussion on the folding
bands spanned by the $i$th decimation filter, we choose
the eligible CPs among the 104 CPs in such a way
that 1) at least 20% of zeros fall within the folding
bands defined in (1), 2) no zero falls in the signal
passband ranging from 0 to $f_c^0$. As a result of extensive
tests, we adopted such a threshold, which is capable
of rejecting about 20 to 60 initial CPs depending on $D$.
Of course, lower thresholds can increase the number
of eligible CPs at the cost of an increased complexity
of the optimization framework discussed below. On
the other hand, when designing the $i$th decimation
filter in a multistage architecture, only the so-called
folding bands must be spanned by zeros, since don't
care frequency bands will be appropriately spanned
by the zeros belonging to the subsequent decimation
filters in the cascade.
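The two selection rules can be sketched with exact rational arithmetic, since the zeros of $C_q(z)$ lie at normalized frequencies $i/q$ with $i$ coprime to $q$. The code below is one possible reading of the rules: band edges are treated as inside the folding bands, a zero exactly at the passband edge is tolerated, and the function names are hypothetical. It is not the author's selection code.

```python
from fractions import Fraction
from math import gcd


def zero_frequencies(q):
    """Normalized frequencies in [0, 1) of the zeros of C_q(z)."""
    return [Fraction(i, q) for i in range(q) if gcd(i, q) == 1]


def is_eligible(q, D, fc, threshold=Fraction(1, 5)):
    """Apply the two selection rules to the CP with index q."""
    zeros = zero_frequencies(q)
    # Rule 2: no zero strictly inside the passband (around 0 and, by symmetry, 1).
    if any(f < fc or 1 - f < fc for f in zeros):
        return False
    # Rule 1: share of zeros within fc of some k/D, k = 1, ..., D - 1.
    in_band = sum(
        1 for f in zeros
        if any(abs(f - Fraction(k, D)) <= fc for k in range(1, D))
    )
    return Fraction(in_band, len(zeros)) >= threshold


if __name__ == "__main__":
    D = 8                        # first-stage decimation factor
    fc = Fraction(1, 2 * 4 * D)  # assumes rho = 4 * D
    print([q for q in range(1, 17) if is_eligible(q, D, fc)])
```

For $D = 8$ and $\rho = 32$ the indexes up to 16 selected by this sketch are 2, 4, 8, 9, 11 and 15.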

Before presenting the optimization algorithm, let
us discuss the requirements imposed on the frequency
response $H_i(f_d)$ of the $i$th decimation filter in the
cascade. Mask specifications [1] are given as for classical
filters as far as the passband ripple is concerned. In
particular, for the optimization algorithm we use the
passband ripple expressed in dB as specified in (2).
The main difference between the design proposed in
this work and classical FIR filter design techniques
relies on the fact that in our setup specifications are
only imposed in the folding bands (1). To this end, we
evaluated the lowest attenuations (worst-case) attained
by each CP belonging to $S_{cp}$ in each folding band:

$$A_d(q) = -\max_{f_d \in [0,\, f_c^{i-1}]} 20\log_{10}\left(|C_q(f_d)|_n\right)$$

$$A_s(k, q) = \min_{f_d \in \left[\frac{k}{D_i} - f_c^{i-1},\ \frac{k}{D_i} + f_c^{i-1}\right]} 20\log_{10}\left(|C_q(f_d)|_n\right) \tag{18}$$

whereby subscript $n$ signifies the fact that each CP $C_q$
has been normalized in such a way as to have unity
gain in baseband. Notice that normalization factors can
be deduced from (14). $A_s(k, q)$ is the worst attenuation
of the $q$th CP in $S_{cp}$ within the $k$th folding band, with
$k \in \{1, \ldots, k_M\}$, and $k_M$ defined in (1). Such values
(in dB) have been stored in look-up tables.
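A look-up table of this kind can be filled by sampling the normalized CP response on a frequency grid. The sketch below evaluates $|C_q(f_d)|_n$ from the roots in (5) and, as an assumption about what "worst case" means, reports the largest normalized gain (the least attenuation) found in a folding band. The grid size and the function names are arbitrary choices made for illustration.

```python
import cmath
import math


def cp_response(q, f):
    """C_q evaluated at z = exp(j*2*pi*f) via its roots (q >= 2)."""
    val = 1.0 + 0j
    for i in range(1, q + 1):
        if math.gcd(i, q) == 1:
            val *= 1 - cmath.exp(2j * math.pi * (i / q - f))
    return val


def gain_db(q, f):
    """Gain in dB of C_q at frequency f, normalized to unity gain at f = 0."""
    g = abs(cp_response(q, f)) / abs(cp_response(q, 0.0))
    return 20 * math.log10(g) if g > 0 else float("-inf")


def worst_case_db(q, k, D, fc, n=201):
    """Largest normalized gain of C_q over the k-th folding band (grid search)."""
    lo, hi = k / D - fc, k / D + fc
    return max(gain_db(q, lo + (hi - lo) * j / (n - 1)) for j in range(n))


if __name__ == "__main__":
    D, fc = 8, 1 / 64            # hypothetical stage: D = 8, rho = 32
    for k in range(1, D // 2 + 1):
        print(k, round(worst_case_db(8, k, D, fc), 2))
```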

TABLE III
Optimization results

D = 8

Set of eligible CPs: 2, 4, 8, 9, 11, 15, 17, 18, 19, 21, 22, 25, 27, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39, 41,
42, 43, 44, 45, 47, 49, 50, 51, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63, 64

A_s = 40, R_p = 1 dB
A_s = 50, R_p = 1 dB
A_s = 60, R_p = 1 dB

Hd8,i{z) = C2iz)C4,{z)Cl{z)Cii{z)
HdsM^) = C^{z)c}{z)Ci{z)C9iz)

A_s = 40, R_p = 2 dB
A_s = 50, R_p = 2 dB
A_s = 60, R_p = 2 dB

HdsM^) = C4{z)Cs{z)Cii{z)Ci7{z)
HdsM^) = C2iz)Ciiz)C^iz)Ci9iz)
Hds,6{z) = C2{z)C4{z)Ci{z)Cii{z)Ci7{z)

D = 16

Set of eligible CPs: 2, 4, 8, 9, 11, 15, 16, 17, 18, 19, 21, 22, 25, 27, 29, 30, 31, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 47, 49, 50, 51, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71, 72, 73,
74, 75, 76, 77, 78, 79, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 93, 94, 95, 97, 98, 99, 100, 101, 102, 103

A_s = 40, R_p = 1 dB
A_s = 50, R_p = 1 dB
A_s = 60, R_p = 1 dB

Hdi6,i(z) = Cs{z)Ci(i{z)Ci7{z)Ci9{z)
Hd16,2{^) = Ciiiz)Ci6{z)Cl^iz)
HdwM^) = C4z)Ci6{z)Cfr{z)

A_s = 40, R_p = 2 dB
A_s = 50, R_p = 2 dB
A_s = 60, R_p = 2 dB

HD16Ai^) = Ci6{z)C-^g{z)
Hd16.5{z) = Cs{z)Cl6{z)Cl7{z)C4l{z)
HdwMz) = Cw{z)C^^iz)C37iz)

D = 32

Set of eligible CPs: 2, 4, 8, 9, 11, 15, 16, 17, 18, 19, 21, 22, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
39, 41, 42, 43, 44, 45, 47, 49, 50, 51, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63, 65, 66, 67, 68, 69, 70, 71,
72, 73, 74, 75, 76, 77, 78, 79, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 93, 94, 95, 97, 98, 99,
100, 101, 102, 103

A_s = 40, R_p = 1 dB
A_s = 50, R_p = 1 dB
A_s = 60, R_p = 1 dB

HD32,liz) = C'iiiz)C4liz)
Hd32,2{z) = C25{z)CI^{z)
Hd32.3{z) = Ct7{z)

A_s = 40, R_p = 2 dB
A_s = 50, R_p = 2 dB
A_s = 60, R_p = 2 dB

Hd32a(z) = C3l{z)C53{z)C67{z)
Hd32Mz) = Cl6{z)Cl^{z)C79{z)
Hd32.6{z) = C?r(z)C37iz)C67{z)

Once the set $S_{cp}$ of eligible CPs along with the
appropriate specifications (passband ripple and folding
band attenuations) have been identified, the optimization
problem can be formulated as follows:



$$\min_{m_1, \ldots, m_{|S_{cp}|}} F\left(m_1, \ldots, m_{|S_{cp}|}\right)\Big|_{\gamma = 0 \text{ in (17)}}$$

subject to:

$$\begin{array}{lll}
0) & \sum_{q=1}^{|S_{cp}|} m_q A_d(q) \le R_p & \text{(ripple)} \\
1) & \sum_{q=1}^{|S_{cp}|} m_q A_s(1, q) \le A_s & \text{(selectivity)} \\
 & \vdots & \\
k) & \sum_{q=1}^{|S_{cp}|} m_q A_s(k, q) \le A_s & \\
 & \vdots & \\
k_M) & \sum_{q=1}^{|S_{cp}|} m_q A_s(k_M, q) \le A_s &
\end{array} \tag{19}$$

The optimization problem can also be solved for different
prescribed selectivities, $A_s$ (as specified in (3)),
around the various folding bands. In this work we do
not pursue this approach. However, notice that such an
approach can be effective for noise shaping ΣΔ A/D
converters, which present an increasing noise power
spectral density for higher and higher values of the
digital frequency $f_d \le 1/2$ [6], [18]. Setting increasing
values of $|A_s|$ in correspondence of successive folding
bands can mitigate noise folding due to the decimation
process.

The solution to the optimization problem (19) is the
set of CP orders $m = [m_1, \ldots, m_{|S_{cp}|}]^T$, whereby
$m_q = 0$ signifies the fact that the $q$th CP in $S_{cp}$ is
not employed for synthesizing $H_i(f_d)$.

Upon collecting the set of $k_M + 1$ conditions in the
matrix $A$:

$$A = \begin{pmatrix}
A_d(1) & \cdots & A_d(|S_{cp}|) \\
A_s(1, 1) & \cdots & A_s(1, |S_{cp}|) \\
\vdots & & \vdots \\
A_s(k_M, 1) & \cdots & A_s(k_M, |S_{cp}|)
\end{pmatrix}$$

and the requirements $b = [R_p, A_s, \ldots, A_s]^T$, the
constraints in (19) can be rewritten as follows:

$$A m \le b$$

By this setup, the optimization problem in (19) with
respect to $m_1, \ldots, m_{|S_{cp}|}$ can be rewritten as

$$\min_{m_1, \ldots, m_{|S_{cp}|}} F\left(m_1, \ldots, m_{|S_{cp}|}\right) \quad \text{subject to:} \quad A m \le b, \quad m_i \ge 0,\ m_i \text{ integer},\ \forall i = 1, \ldots, |S_{cp}|$$

and solved by mixed integer linear programming techniques
[32]. We solved the optimization problem using
the MATLAB function linprog along with a new MATLAB
file capable of managing integer constrained solutions
(the latter file is available online [33]).
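For very small problems the structure of the integer program can be illustrated by exhaustive search over the CP orders. The sketch below is not the MATLAB procedure mentioned above: it is a brute-force stand-in, and the cost vector, matrix entries and bounds are made-up toy numbers chosen only to show the mechanics (linear cost, constraints of the form $Am \le b$, non-negative integer orders).

```python
from itertools import product


def solve_integer_program(cost, A, b, max_order=5):
    """Minimize cost . m subject to A m <= b with integer 0 <= m_q <= max_order."""
    best = None
    for m in product(range(max_order + 1), repeat=len(cost)):
        feasible = all(
            sum(a * mq for a, mq in zip(row, m)) <= bi
            for row, bi in zip(A, b)
        )
        if feasible:
            f = sum(c * mq for c, mq in zip(cost, m))
            if best is None or f < best[0]:
                best = (f, m)
    return best


if __name__ == "__main__":
    # Toy data (hypothetical): two candidate CPs, one folding band.
    cost = [1, 2]                 # adders per CP
    A = [[0.2, 0.5],              # passband droop per CP, in dB
         [-15, -25]]              # folding-band level per CP, in dB
    b = [1.0, -40]                # ripple bound and selectivity target, in dB
    print(solve_integer_program(cost, A, b))
```

Exhaustive search grows exponentially with the number of eligible CPs, which is why a dedicated mixed integer solver is the practical choice for sets with tens of candidates.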

The results of the previous optimization problem are
summarized in Table III for various $A_s$ specifications
and two different values of $R_p$, namely $R_p = 1$ and
2 dB. We solved the problem for three different values
of the decimation factor $D$ of the first stage in the
decimation chain depicted in Fig. 1, by assuming that the
residual decimation factor is 4 (in other words, we
assumed that $\rho = D \cdot 4$). Notice that such an approach
is quite usual in practice in that the first decimation
filter accomplishes the highest possible decimation in
order to reduce the sampling rate, while the subsequent
decimation stages are usually accomplished with half-band
filters, each one decimating by 2 [1].

Fig. 2. Behaviours in dB of the modulo of the frequency responses
$H_{8,1}(f_d)$, $H_{8,2}(f_d)$, $H_{8,3}(f_d)$ of the optimized decimation filters
shown in Table III for D = 8.

Fig. 3. Behaviours in dB of the modulo of the frequency responses
$H_{16,1}(f_d)$, $H_{16,2}(f_d)$, $H_{16,3}(f_d)$ of the optimized decimation filters
shown in Table III for D = 16.

The first row related to any decimation factor shows
the set of eligible CPs found in the preliminary design
step discussed above, while the z-transfer functions of
the CPs can be found in Table IV (see also [31] for a
list of all 104 CPs).

It is worth comparing the frequency responses of
the optimized filters $H_{8,i}(f_d)$ and $H_{16,i}(f_d)$ (for $i = 1, 2, 3$)
in Table III with the specifications $R_p = 1$ dB
and various $A_s$. To this end, Figs. 2 and 3 show,
respectively, the behaviours of the frequency responses
$H_{8,i}(f_d)$ and $H_{16,i}(f_d)$ along with the imposed
selectivity $A_s$ around the various folding bands (identified
by horizontal bold lines).



V. Implementation Issues 

This section addresses the design of optimized CP-based
decimation filters. For conciseness, we will
focus on the design of decimation filter $H_{8,2}(z)$ shown
in Table III, even though the considerations which
follow can be applied to any other decimation filter
quite straightforwardly. The decimation stage related
to $H_{8,2}(z)$ is depicted in Fig. 4(a): this decimation filter
will be designed through a variety of architectures
following from different mathematical ways to simplify
the analytical relation defining $H_{8,2}(z)$.

First of all, notice that upon substituting the
appropriate equations of the constituent CP filters in
$H_{8,2}(z)$, the designed filter takes on the following
expression:

$$H_{8,2}(z) = C_2^2(z)\, C_4^3(z)\, C_8^3(z) = (1 + z^{-1})^2 (1 + z^{-2})^3 (1 + z^{-4})^3 \tag{20}$$

which can be rewritten as follows: 

HsM^) = ,\ _i ^ (21) 

From the commutative property employed in [12],
the cascaded implementation shown in Fig. 4(b) easily
follows. The $r$th stage in Fig. 4(b) operates at the
sampling rate $f_{i-1}/2^r$, whereby $f_{i-1}$ is the data
sampling frequency at the filter input as shown in
the multistage architecture in Fig. 1. Further power
consumption reduction can be achieved by applying
polyphase decomposition to the architecture shown in
Fig. 4(b). To this aim, consider the z-transfer function
of the 3rd order cell:

$$(1 + z^{-1})^3 = 1 + 3z^{-2} + z^{-1}(3 + z^{-2}) = E_0(z^2) + z^{-1} E_1(z^2), \qquad E_0(z) = 1 + 3z^{-1}, \quad E_1(z) = 3 + z^{-1} \tag{22}$$

The polyphase architecture for $(1 + z^{-1})^3$ easily follows
from the commutative property applied to the two
filters $E_0(z^2)$ and $E_1(z^2)$ in (22), and it is shown in
Fig. 4(c) along with the architectures for implementing
both $E_0(z)$ and $E_1(z)$. Notice that the multipliers
appearing in $E_0(z)$ and $E_1(z)$ can be implemented
in the form of shift registers as depicted in Fig. 4(d).
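The two-phase split of the 3rd order cell can be reproduced by taking every second tap of its impulse response. The following is a minimal sketch; `polyphase` is an illustrative helper name.

```python
def poly_mul(a, b):
    """Product of two polynomials given in ascending powers of z^-1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def polyphase(h, D):
    """Polyphase components E_i of h for decimation by D: taps h[i], h[i+D], ..."""
    return [h[i::D] for i in range(D)]


cell = [1]
for _ in range(3):
    cell = poly_mul(cell, [1, 1])    # builds (1 + z^-1)^3

E0, E1 = polyphase(cell, 2)

if __name__ == "__main__":
    print(cell, E0, E1)
```

The only non-trivial coefficient is 3, which can be formed as $2x + x$, i.e., one left shift and one addition.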

The actual complexity of the architecture shown in
Fig. 4 is fully defined once the data wordlength in
any substage is well characterized, since the power
consumption of a filter cell can be approximated as the
product between the data rate, the number of additions
performed at that rate, and the data wordlength. While
the data rate along with the number of additions
are well defined, data wordlength in each substage
in Fig. 4(b) is not. Given the input data wordlength,
$R$ (in bits), the data size at the output of the first
decimation substage in Fig. 4(b) is equal to $R + 2$ bits
since two carry bits have to be allocated for the two
additions involved in that substage. With a similar
reasoning, data wordlength increases at the output
of each subsequent substage in Fig. 4(b) in order to
take into account the increase of data size due to the
involved additions.

[Figure 4: panels (a) to (e); panel (b) is drawn for $D = 2^3$, with data wordlengths R, R+2 and R+5 marked along the cascade.]

Fig. 4. Efficient architectures for implementing the decimation stage embedding $H_{8,2}(z)$ (a). Non recursive architecture (b); polyphase
implementation of the decimation stages decimating by 2 (c), and polyphase component implementation using shift registers (d); recursive
architecture of the decimation filter $H_{8,2}(z)$ (e).

As a reference example, if the decimation filter
depicted in Fig. 4(b) is the first decimation stage at
the output of a ΣΔ A/D converter embedding a 1-bit
quantizer into the loop, it is $R = 1$. Thus, data
wordlength is as low as 3 bits after the first decimation
substage, and so on.
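The wordlength bookkeeping along the cascade can be sketched as a running sum. The code assumes that a substage of the form $(1 + z^{-1})^k$ needs $k$ extra bits (its gain is $2^k$), and that the three substages have orders 2, 3 and 3; both points are assumptions stated here for illustration.

```python
def wordlengths(R, orders):
    """Wordlength at the input and after each substage (1 + z^-1)^k."""
    out = [R]
    for k in orders:
        out.append(out[-1] + k)      # k carry bits for a gain of 2^k
    return out


if __name__ == "__main__":
    print(wordlengths(1, [2, 3, 3]))  # 1-bit quantizer at the input
```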

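Let us address the design of a recursive architecture
for $H_{8,2}(z)$ in (20). First of all, consider the following
equality chain

$$\prod_{i=0}^{\log_2 D - 1} \left(1 + z^{-2^i}\right) = \sum_{i=0}^{D-1} z^{-i} = \frac{1 - z^{-D}}{1 - z^{-1}} \tag{23}$$

whereby the first equality holds for any $D$ that can
be written as an integer power of 2, i.e., $D = 2^p$. On
the other hand, the last equality holds for any integer
value of $D$. Notice that decimation factors of the form
$2^p$ are quite common in practice. Upon using (23) with
$p = 3$ and $D = 2^3$, (20) can be rewritten as follows:

$$H_{8,2}(z) = \frac{\left[\prod_{i=0}^{2} \left(1 + z^{-2^i}\right)\right]^3}{1 + z^{-1}} = \left[\frac{1 - z^{-8}}{1 - z^{-1}}\right]^3 \frac{1}{1 + z^{-1}} \tag{24}$$

The last relation in (24) can be simplified as follows:

$$H_{8,2}(z) = \frac{(1 - z^{-8})^3}{(1 - z^{-1})^3 (1 + z^{-1})} = \frac{(1 - z^{-8})^3}{1 - 2z^{-1} + 2z^{-3} - z^{-4}} \tag{25}$$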

A recursive implementation of filter $H_{8,2}(z)$ in (25) is
shown in Fig. 4(e). It is obtained in the same way as a
classic cascade integrator-comb (CIC) implementation
[11]. In other words, the numerator in (25) corresponds
to the comb sections at the right of the decimator
by $D$, while the denominator is responsible for the
integrator sections at the left of the decimator by
$D = 8$.
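A minimal behavioural sketch of this recursive structure follows, assuming zero initial state and exact (unbounded) integer arithmetic, so register overflow is not modelled. The recursive section realizes $1/(1 - 2z^{-1} + 2z^{-3} - z^{-4})$ at the input rate, and the comb $(1 - z^{-1})^3$ runs after the decimator by $D = 8$. All function names are hypothetical, and the filter coefficients are those implied by (25).

```python
def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, n):
    out = [1]
    for _ in range(n):
        out = poly_mul(out, a)
    return out

def fir(x, h):
    """y[n] = sum_k h[k] x[n-k], zero initial state."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def all_pole(x, den):
    """y[n] = x[n] - sum_{k>=1} den[k] y[n-k], with den[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(den)):
            if n - k >= 0:
                acc -= den[k] * y[n - k]
        y.append(acc)
    return y

D = 8
# Assumed factorization: (1 + z^-1)^2 (1 + z^-2)^3 (1 + z^-4)^3.
H = poly_mul(poly_mul(poly_pow([1, 1], 2), poly_pow([1, 0, 1], 3)),
             poly_pow([1, 0, 0, 0, 1], 3))
DEN = [1, -2, 0, 2, -1]        # (1 - z^-1)^3 (1 + z^-1)
COMB = poly_pow([1, -1], 3)    # (1 - z^-8)^3 moved past the decimator

def decimate_direct(x):
    """Reference: full-rate FIR filtering followed by decimation by D."""
    return fir(x, H)[::D]

def decimate_recursive(x):
    """Integrator-like section, decimation by D, then low-rate comb."""
    return fir(all_pole(x, DEN)[::D], COMB)

bits = [(n * n + 3 * n) % 5 % 2 for n in range(200)]
assert decimate_direct(bits) == decimate_recursive(bits)
```

The recursive section has poles on the unit circle, so its internal values grow without bound here; practical CIC-style designs usually rely on wrap-around arithmetic with sufficiently wide registers, which this sketch does not attempt to reproduce.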

The derivation yielding (25) starting from (24) can
also be accomplished by following another line of
reasoning based on the following relation:

$$1 + z^{-n/2} = \frac{1 - z^{-n}}{1 - z^{-n/2}} \tag{26}$$

Notice that $(1 - z^{-8})^3$ becomes $(1 - z^{-1})^3$ upon its shifting
through the decimator by $D = 8$.

We discuss this other approach for completeness, since it can be
effective for deriving an appropriate architecture for the other
decimation filters shown in Table III.
Fig. 5. Architecture of the polyphase implementation of the
decimation filter $H_{8,2}(z)$ with $D = 8$ (a), and efficient design of the first two
polyphase components $E_0(z) = 1 + 22z^{-1} + 9z^{-2}$ and $E_1(z) = 2 + 24z^{-1} + 6z^{-2}$ (b).



which is valid for any positive $n = 2^{p+1} w$ with $w$
an odd integer. By doing so, (20) can be rewritten as
follows:

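$$H_{8,2}(z) = (1 + z^{-1})^2 (1 + z^{-2})^3 (1 + z^{-4})^3 = \left(\frac{1 - z^{-2}}{1 - z^{-1}}\right)^2 \left(\frac{1 - z^{-4}}{1 - z^{-2}}\right)^3 \left(\frac{1 - z^{-8}}{1 - z^{-4}}\right)^3 \tag{27}$$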

Upon simplifying, (27) yields (25).
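For reference, the simplification can be spelled out as a telescoping product. The intermediate step below is a worked expansion added for clarity and uses only the identity $1 - z^{-2} = (1 - z^{-1})(1 + z^{-1})$; it assumes the exponents 2, 3, 3 implied by (25).

```latex
\left(\frac{1 - z^{-2}}{1 - z^{-1}}\right)^{2}
\left(\frac{1 - z^{-4}}{1 - z^{-2}}\right)^{3}
\left(\frac{1 - z^{-8}}{1 - z^{-4}}\right)^{3}
= \frac{(1 - z^{-8})^{3}}{(1 - z^{-1})^{2}\,(1 - z^{-2})}
= \frac{(1 - z^{-8})^{3}}{(1 - z^{-1})^{3}\,(1 + z^{-1})}
```

The factors $(1 - z^{-4})^3$ cancel outright, and two of the three factors $(1 - z^{-2})$ in the denominator cancel against the first numerator.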

An alternative non-recursive architecture stems from
a full polyphase decomposition of the transfer function
$H_{8,2}(z)$. Upon carrying out the polynomial multiplications
in (20), $H_{8,2}(z)$ can be rewritten as follows:

$$H_{8,2}(z) = \sum_{i=0}^{20} h(i)\, z^{-i} \tag{28}$$

By applying the polyphase decomposition [34],
$H_{8,2}(z)$ can be rewritten as

$$H(z) = \sum_{i=0}^{D-1} z^{-i} E_i(z^D), \qquad E_i(z) = \sum_{t=0}^{\lfloor L/D \rfloor} h(D \cdot t + i)\, z^{-t}, \quad 0 \le i \le D-1 \tag{29}$$

whereby $L = 21$ is the length of the impulse response,
and $h(k) = 0$ for $k \ge L$. The z-transfer function in (29)
is implemented with the architecture shown in Fig. 5(a).
The polyphase components $E_i(z)$, $0 \le i \le D - 1 = 7$,
can be easily obtained by employing (28). In particular,
the first two polyphase components take on the following
expressions:

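$$\begin{aligned}
E_0(z) &= 1 + 22z^{-1} + 9z^{-2} \\
       &= (2^3 + 2^0)\, z^{-1} (2 + z^{-1}) + 2^2 z^{-1} + 1 \\
E_1(z) &= 2 + 24z^{-1} + 6z^{-2} \\
       &= 2\left[1 + (2^1 + 1)\, z^{-1} (z^{-1} + 2^2)\right]
\end{aligned} \tag{30}$$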



An efficient architecture for implementing each
polyphase component $E_i(z)$ stems from the decomposition
of each integer coefficient into a sum of powers of two,
as shown in (30) for the first two polyphase components
$E_0(z)$ and $E_1(z)$. By doing so, and by exploiting
coefficient sharing, practical architectures featuring a
minimum number of shift registers easily follow, as
depicted in Fig. 5(b). Similar considerations can be
employed for obtaining the architectures of the remaining
polyphase components $E_2(z), \ldots, E_7(z)$.
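The polyphase split and the shift-and-add forms in (30) can be illustrated with a short sketch. The impulse response below is obtained by expanding $(1+z^{-1})^2(1+z^{-2})^3(1+z^{-4})^3$, which is an assumption consistent with the coefficients of $E_0(z)$ and $E_1(z)$ quoted in the text; the function names are hypothetical, and the arguments `x0`, `x1`, `x2` stand for the current and the two previous inputs of a polyphase branch.

```python
# Impulse response h(0..20) of the assumed H_{8,2}(z).
H = [1, 2, 4, 6, 9, 12, 16, 20, 22, 24, 24, 24,
     22, 20, 16, 12, 9, 6, 4, 2, 1]
D = 8

def polyphase(h, d):
    """E_i = [h(i), h(d + i), h(2d + i), ...] for i = 0..d-1."""
    return [h[i::d] for i in range(d)]

E = polyphase(H, D)

def e0_shift_add(x0, x1, x2):
    """x0 + 22*x1 + 9*x2 using shifts and adds only.

    Mirrors (2^3 + 2^0) z^-1 (2 + z^-1) + 2^2 z^-1 + 1 in (30).
    """
    t = (x1 << 1) + x2             # z^-1 (2 + z^-1) applied to the input
    return (t << 3) + t + (x1 << 2) + x0

def e1_shift_add(x0, x1, x2):
    """2*x0 + 24*x1 + 6*x2 using shifts and adds only.

    Mirrors 2 [1 + (2^1 + 1) z^-1 (z^-1 + 2^2)] in (30).
    """
    t = x2 + (x1 << 2)             # z^-1 (z^-1 + 2^2) applied to the input
    return (x0 + (t << 1) + t) << 1

def decimate_direct(x, h, d):
    """Full-rate FIR filtering followed by decimation by d."""
    full = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]
    return full[::d]

def decimate_polyphase(x, h, d):
    """y[m] = sum_i sum_t E_i[t] * x[d*(m - t) - i], zero initial state."""
    comps = polyphase(h, d)
    y = []
    for m in range((len(x) + d - 1) // d):
        acc = 0
        for i, e in enumerate(comps):
            for t, c in enumerate(e):
                k = d * (m - t) - i
                if 0 <= k < len(x):
                    acc += c * x[k]
        y.append(acc)
    return y

print(E[0], E[1])  # [1, 22, 9] [2, 24, 6]
```

The intermediate term `t` is computed once and reused with two different shifts, which is one simple reading of the coefficient sharing mentioned above.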

VI. Conclusions 

This paper addressed the design of multiplier-less
decimation filters suitable for oversampled digital
signals. The aim was twofold. On one hand, it proposed
an optimization framework for the design of the
constituent decimation filters in a general multistage
decimation architecture, using cyclotomic polynomials
(CPs) as basic building blocks, since the first 104 CPs
have coefficients in $\{-1, 0, +1\}$. On the other hand,
the paper provided a set of techniques, most of which
stem from key properties of CPs, for designing the
optimized filters in a variety of architectures. Both
recursive and non-recursive architectures have been
discussed by focusing on a specific decimation filter
obtained as a result of the optimization algorithm.
Design guidelines were provided with the aim of
simplifying the design of the constituent decimation
filters in the multistage chain.
TABLE IV

The first sixty cyclotomic polynomials.
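The entries of such a table can be regenerated from the standard relation $x^n - 1 = \prod_{d \mid n} \Phi_d(x)$, with $x$ standing for $z^{-1}$. The sketch below is an illustration under that assumption and is not the authors' procedure; the helper names are hypothetical. It also checks the property quoted in the paper, namely that the first 104 CPs have coefficients in $\{-1, 0, +1\}$.

```python
def poly_divexact(num, den):
    """Exact division of integer polynomials (ascending powers).

    The divisor must be monic, which holds for cyclotomic polynomials.
    """
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1]
        for j, dj in enumerate(den):
            num[k + j] -= quot[k] * dj
    assert not any(num), "division was not exact"
    return quot

def cyclotomic(n_max):
    """Return {n: coefficients of Phi_n(x)} for n = 1..n_max."""
    phi = {}
    for n in range(1, n_max + 1):
        poly = [-1] + [0] * (n - 1) + [1]      # x^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divexact(poly, phi[d])
        phi[n] = poly
    return phi

def has_simple_coefficients(poly):
    """True if every coefficient lies in {-1, 0, +1}."""
    return all(c in (-1, 0, 1) for c in poly)

PHI = cyclotomic(105)

# Building blocks of H_{8,2}(z): Phi_2, Phi_4, Phi_8.
print(PHI[2], PHI[4], PHI[8])  # [1, 1] [1, 0, 1] [1, 0, 0, 0, 1]

# The first 104 CPs are multiplier-less; Phi_105 is the first exception.
assert all(has_simple_coefficients(PHI[n]) for n in range(1, 105))
assert not has_simple_coefficients(PHI[105])
```

Dividing out the polynomials of all proper divisors keeps the computation in exact integer arithmetic, at the cost of recomputing from $x^n - 1$ for every index.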


